feat: add FastAppend #516
869196d to 7cfc0e9
Change DataFileSet from std::unordered_set to a custom class that preserves insertion order, similar to Java's DataFileSet which uses LinkedHashSet. This is important for row ID assignment in v3 manifests, where row IDs are assigned based on the order files are written. The implementation uses both a vector (for insertion order) and an unordered_set (for O(1) duplicate detection) to maintain the same API while preserving order.
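As a rough illustration of the vector-plus-hash-set approach described in this commit message, here is a minimal sketch. It is not the code from this PR: the `DataFile` struct, the class name `OrderedDataFileSet`, and the choice to treat the file path as the identity of a file are all assumptions made for the example.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <unordered_set>
#include <utility>
#include <vector>

// Hypothetical stand-in for the real data file type; only the path is modeled.
struct DataFile {
  std::string file_path;
};

// Sketch of a set that iterates in insertion order.
// Assumption: two files are duplicates when their paths are equal.
class OrderedDataFileSet {
 public:
  // Returns true if the file was added, false if it was null or a duplicate.
  bool insert(std::shared_ptr<DataFile> file) {
    if (!file) return false;
    // The hash set gives average O(1) duplicate detection.
    if (!paths_.insert(file->file_path).second) return false;
    // The vector records the order in which files were first inserted.
    files_.push_back(std::move(file));
    return true;
  }

  bool contains(const std::string& path) const { return paths_.count(path) > 0; }
  std::size_t size() const { return files_.size(); }

  // Iteration follows insertion order, not hash order.
  auto begin() const { return files_.begin(); }
  auto end() const { return files_.end(); }

 private:
  std::vector<std::shared_ptr<DataFile>> files_;
  std::unordered_set<std::string> paths_;
};
```

Iterating this container yields files in the order they were first inserted, which is the property the commit message relies on for row ID assignment. The sketch omits removal; supporting it would require keeping the vector and the hash set in sync.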
5faf24c to 7118dcc
#include "iceberg/result.h"
#include "iceberg/type_fwd.h"
#include "iceberg/update/snapshot_update.h"
#include "iceberg/util/content_file_util.h"
nit: Consider moving DataFileSet out of content_file_util.h to avoid including more headers than needed. We may also need some test cases for it.
ok, do you mind if I do the refactor in a separate PR, along with some of the other TODOs we mentioned in this PR?
Yes, feel free to add them as followups.
Thanks a lot for working on this! This completes the 0.2.0 milestone.